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Abstract 

Given a set of mixtures, blind source separation attempts to re- 
trieve the source signals without or with very little information of the 
the mixing process. We present a geometric approach for blind sepa- 
ration of nonnegative linear mixtures termed facet component analysis 
(FCA) . The approach is based on facet identification of the underlying 
cone structure of the data. Earlier works focus on recovering the cone 
by locating its vertices (vertex component analysis or VCA) based on a 
mutual sparsity condition which requires each source signal to possess 
a stand-alone peak in its spectrum. We formulate alternative condi- 
tions so that enough data points fall on the facets of a cone instead 
of accumulating around the vertices. To find a regime of unique solv- 
ability, we make use of both geometric and density properties of the 
data points, and develop an efficient facet identification method by 
combining data classification and linear regression. For noisy data, we 
show that denoising methods may be employed, such as the total vari- 
ation technique in imaging processing, and principle component anal- 
ysis. We show computational results on nuclear magnetic resonance 
spectroscopic data to substantiate our method. 
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1 Introduction 



Blind source separation (BSS) is a major area of research in signal and 
image processing [7J. It aims at recovering source signals from their mix- 
tures with minimal knowledge of the mixing environment. The applica- 
tions of BSS range from engineering to neuroscience. A recent emerging 
research direction of BSS is to identify chemical explosives and biologi- 
cal agents from their spectral sensing mixtures recorded by various spec- 
troscopy such as Nuclear Magnetic Resonance (NMR), Raman spectroscopy, 
Ion- mobility spectroscopy (IMS), and differential optical absorption spec- 
troscopy (DOAS),etc. Spectral sensing is a critical area in national security 
and a vibrant scientific area. The advances of modern imaging and spectro- 
scopic technology have made it possible to classify pure chemicals by their 
spectral features. However, mixtures of chemicals subject to changing back- 
ground and environmental noise pose additional challenges. The goal of this 
paper is to develop a BSS method to process data in the presence of noise 
based on geometric spectral properties. 

To separate the spectral mixtures, one needs to solve the following matrix 
decomposition problem 

X = AS + N, (1.1) 

where A € j^ mx « j s a f u n ran k unknown basis (dictionary) matrix or the 
so called mixing matrix in some applications, N 6 M mx P is an unknown 
noise matrix, S = [s(l), . . . , s(p)] 6 R nxp is the unknown source matrix 
containing signal spectra in its rows. Here p is the number of data samples, 
m is the number of observations, and n is the number of sources. Various 
BSS methods have been proposed based on a priori knowledge of source 
signals such as statistical independence, sparseness, nonnegativity, [H El [7j 
HB H2l H51 HQ US HOI US among others. As a matrix factorization 
problem in the noise free case (N = 0), BSS has permutation and scaling 
ambiguities in its solutions similar to factorizing a large number into product 
of primes. For any permutation matrix P and invertible diagonal matrix A, 
(APA, A^P^S) is another pair equivalent to the solution (A,S), since 

X = AS = (^PA)(A" 1 P~ 1 5). (1.2) 

Recently there has been active research on BSS by exploiting data ge- 
ometry PiaElIigEOlEIJEaEa^ For simplicity, let N = 
in (ll.lD . The geometric observation [TJ [20| [27] is that if each row of S has 
a dominant peak at some location (column number) where other rows have 
zero elements, then the problem of finding the columns of the mixing matrix 
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A reduces to the identification of the edges of a minimal cone containing the 
columns of mixture matrix X. In hyperspectral imaging (HSI), the condi- 
tion is known as pixel purity assumption (PPA, [5]). In other words, each 
pure material of interest exists by itself somewhere on the ground. The PPA 
based convex cone method (known as N-findr [27]) is now a benchmark in 
HSI, see [5l [20l [2"T| [2"2j [23] for its more recent variants. The method termed 
vertex component analysis (VGA) proposed in [22J is worth mentioning here 
being a fast unmixing algorithm for hyperspectral data. In Nuclear Mag- 
netic Resonance (NMR) spectroscopy which motivates our work here, PPA 
was reformulated by Naanaa and Nuzillard in [20]. The source signals are 
only required to be non-overlapping at some locations of acquisition vari- 
able (e.g. frequency) which leads to a dramatic mathematical simplification 
of a general non-negative matrix factorization problem which is non- 

convex [T2]. More precisely, the source matrix S > is assumed to satisfy 
the following 

Assumption 1 (NNA). : For each i E {1,2, ... ,n} there exists an ji £ 
{1, 2, . . . ,p] such that Sij i > and Skj t = (k = 1, . . . ,i — l,i + 1, . . . , n) . 

Simply put, the stand-alone peaks possessed by each source allow for- 
mation of a convex cone enclosing all the columns of X, and the edges (or 
vertices) of the cone are the columns of the mixing matrix. To illustrate 
the idea, let us consider an example of three sources and three mixtures 
(A e M 3x3 , X £ R 3x p, S e R 3xp , and A is non-singular). 

[X(:,1),X(:,2),... ,X(:,p)] 



(* • • • * 1 * 
* ■■• * 1 * 
* • • • * 1 * 




where X(k, :) represents the fe-th row of X, 1 represents nonzero entry, and 
the three l's are three stand-alone peaks. It can be seen that A's columns 
are actually among those of X's up to constants, and other columns of X 
are nonnegative linear combinations of columns of A. Geometrically, the 
columns of A span a convex cone enclosing all X's columns. The estimation 
of A is equivalent to the identification of this cone. As a matter of fact, 
the vertices of the cone are the columns of A. Fig. Q] shows this vertex- 
represented cone containing all data points. To find the vertices (or edges) 
of the cone, the following optimization problem is solved for each column k 
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Figure 1: A cloud of data points (columns of X), left plot, is rescaled to lie 
on a plane determined by three vertices of the cone (right plot). 

of X: 

p 

c = min Xj 

p 

s.t. Yl X(:,j)\ j =X(:,k) ,\j>0. 

It is shown [9] that X(:,k) is an edge of the convex cone if and only if the 
optimal objective function value c* is greater than 1. 

If the data are contaminated by noise, the following optimization problem 
is solved for each k [20] : 

p 

min score = min | X(:,j)Xj — X(:, (1.3) 

A score is associated with each column of X. Columns with low scores are 
unlikely to be a column of A because this column is roughly a nonnegative 
linear combination of the other columns of X. On the other hand, a high 
score means that the corresponding column is far from being a nonnegative 
linear combination of other columns. The n rows from X with highest scores 
are selected to form A, the mixing matrix. 
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Though the vertex based convex cone method above is geometrically 
elegant, the working condition is still restrictive. The success of the cone 
method highly depends on the recognition of its vertices. If PPA or NNA 
is violated, the vertices are not in the data matrix X, and may not be the 
primary objects for identification. In this paper, we consider such a scenario 
where data points (scaled columns of X) lie either inside or on the facets of a 
convex cone, yet none of them are located on edges (vertices), see Fig. [5] for 
an example. The cone structure will be reconstructed from its facets instead 
of the vertices. A facet component analysis (FCA) should be pursued for 
the data considered. The appropriate source condition for this case is that 
most columns of source matrix S (m rows) possess m — 1 nonzero entries. 
We shall pursue a more precise study on the source condition for solvability 
in the next section. The problem above calls for an identification method of 
flat submanifolds from a point cloud, as the facets of a convex cone in general 
lie on hyperplanes. The vertices are obtained from the intersections of the 
hyperplanes expanded from the facets. We shall study the identification of 
these flat manifolds and the subequent recovery of the source signals. The 
extraction of meaningful geometric data structures help the data matrix 
factorization and reduce the computational cost. 

Recently, a dual cone based approach to BSS problem was proposed in 
|21| . The method first calculates the dual of the data cone, then selects a 
subset of the source signals from a set of source candidates by means of a 
greedy process. Specifically, the first step consists in computing a generating 
matrix of the dual of the data cone by the double description method [19] . 
The second step consists in extracting an estimate of the source matrix 
from a larger matrix which is the product of the generating matrix with its 
transpose. Multiple solutions can be obtained in the second step, so the 
author imposed an additional orthogonality constraint on the source sigals 
to get a sub-optimal solution. Overall, the method proposed in [21] appears 
indirect and computationally unwieldy, besides requiring the orthogonality 
of source signals. Here we opt to solve the problem directly by identifying 
the facets of the data cone under a unique solvability condition. 

The rest of the paper is organized as follows. In section 2, we propose 
a new source condition for solving (|1 .2j) based on the geometric structure 
of the data points, and develop an algorithm for facet identification and 
reconstruction of the associated convex cone. In section 3, we present com- 
putational examples and results. For noisy data, we show that a denoising 
method maybe needed to help remove or reduce the noise. A denoising 
method based on total variation and distance function is discussed. Con- 
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Figure 2: Scatter plots of columns of X. The cloud of data point (left) 
rescaled to lie on a plane determined by the three vertices of the cone (right). 



eluding remarks and future work are in Section 4. 

The work was partially supported by NSF-ATD grant DMS-0911277. 

2 Proposed Method 

Let us consider the noiseless case in order to illustrate the basic idea behind 
our method. The first source condition on the problem is as follows: 

Assumption 2. S contains at least m — 1 linearly independent column 
vectors orthogonal to each unit coordinate vector e%, % = 1, ... ,m. 

For the case m = 3, we have three mixtures and three sources (i.e, 
i£R 3x3 ,X G R Sxp , S G M 3xp , and A is non-singular), one example can be: 



[X(:,1),X(:,2),-.. 



X(:,p)] 




*120012* 
*211200* 
*002121* 



We note in passing that PPA or NNA (stand-alone peak) assumption actu- 
ally is a special (more restrictive) case of the assumption [2] above. 



Let M be a matrix of R mxp , the subset of R m defined by 

M = cone(M) = {M a \a ^ 0} 

is a convex cone, and M is said to be a generating matrix of M since every 
element of M is a nonnegative linear combination of M columns. Let X = 
cone(X) and A = cone(A). Under Assumption [21 we have the following (or 
combining Lemma 3 and Lemma 5 of |21j): 

Theorem 1. If X = AS, and A, S 0, then X Q A. Moreover, each facet 
of A contains a facet of X . 

For readers' convenience, a short proof is given below. 

Proof. Vx € X, let x = X a, a 0. So x = A S a = A (S a), where S a fc= 0. 
So clearly x£A 

The second claim follows as we notice that: (1) A has m facets and each 
one is spanned by m — 1 column vectors of A; (2) Using Assumption [2j 
X = AS has at least m — 1 linearly independent column vectors located in 
each facet of A; (3) X C A □ 

Let x = (xi, . . . , x m ) T , 1 = (1, . . . , 1) T . Since A is nonsingular, >l has 

m extreme directions and thus has ( 171 ,) = m facets. Based on Theorem 

\m— 1/ 

[H our method aims to identify the m facets of X contained in the facets 
of A. If we project X's column vectors onto the plane x T -1 = 1 (i.e. L\- 
normalize the column vectors), the resulting data points together with the 
origin O = (0, . . . , 0) has a m-dimensional convex hull with I facets(/ > 
m + 1). Let us denote it by Conv(X). In fact, Conv(X) results from the 
cone X being truncated by the plane x T 1 = 1. We then acquire all the 
facets and the associated vertices of Conv(X). This can be done by means 
of the function 'Convhulln' from the Matlab library. It is based on 'Qhull' 
which implements the Quickhull algorithm for computing the convex hull. 

One of the facets is contained in the plane x T 1 = 1, and we call it 
trivial facet. The other I — 1 facets having O as one of their vertices are 
called nontrivial facet. By Theorem [TJ X has at least the same number 
of nontrivial facets as A. In fact, it has more nontrivial facets than A in 
most cases. If we randomly choose m nontrivial facets from those of X, 
the solutions are clearly nonunique. Hence we need an additional source 
assumption to provide a selection criterion: 

Assumption 3. The m facets of X containing the largest numbers of data 
points are contained in the m facets of A. 
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Figure 3: Left: the cone A and the data points. Right: the convex hull 
Conv(A) and the convex hull Conv(X) formed by rescaled data points. 

By assumption we count and sort the number of data points in each 
nontrivial facets of Conv(X) followed by selecting the m facets with the 
largest numbers of data points. Each of the facets is actually contained in 
one facet of A. So the intersection of any m — 1 facets out of the acquired 
m facets is an edge of A. By intersecting all m edges with the hyperplane 
x T • 1 = 1, we obtain the m column vectors of the mixing matrix A. Finally, 
nonnegative least squares method yields the source matrix S. 

Based on the ideas above, we now include the additive noise and put 
forward the algorithm under assumptions [2]l3] as follows: 

Algorithm 1 (Face Component Analysis). (A,S) = FCA(X, p,e,a,S); pa- 
rameters p > 0; e, <T, 5 € (0, 1). 

1. (Preprocessing) Xq = max{A", 0}. If \\Xq\\2 < p, delete Xq from Xq, 

denote by Xq the resulting matrix. Project each Xq onto the plane 
^ ■ 1 = 1. 

2. (Convex hull) Add the origin O as the first column to Xq. Return 
all the facets and vertices of Conv(X), keep only the nontrivial facets. 
Denote by F{ the ith facet and Vi the set of its vertices. 

3. (Grouping) Initialize Gi = Vi. If Xq Gi and dist(Xo ,Fi) < e w 
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p = 1e-10 



p = 1 e-4 



p = 2e-3 




Figure 4: Comparison of denoising effects with different p. 

and dist{Xo , Vj) > a ~ 0, add Xq to Gi. 

4- (Plane fitting) For each Gi, obtain the equation of its fitting plane 
denoted as a? ■ bi = 0, where bi is the normal vector with norm 1. 
Select m planes from Gi 's with the largest cardinalities such that the 
inner product of any two bis < 5 ~ 1. 

5. (Intersecting) Obtain the m intersections of any m — 1 planes out of m 
planes from step 4 with the plane x T ■ 1 = 1, and form the m columns 
of mixing matrix A. 

6. ( Source recovering ) For each column Xq , solve the following optimiza- 
tion problem to find the corresponding column S 3 of S as: 

minimize \\Xq — AS^\\2, 

subject to Si !>= 

Remark 2. 1. We can denoise by increasing the threshold p in step 1. 
The effects of denoising with varied p are present in Fig. [|| We add 
white Gaussian noise with signal-to-noise ratio (SNR) = 80 dB to three 
noiseless mixtures. In the left plot of Fig. an outside triangle forms 
due to the intersections of the plane x + y + z = 1, x-plane, y-plane and 
z-plane. When noiseless signal is corrupted by additive white Gaussian 
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noise, many data points with norms of almost tend to get negative 
entries. Then we apply the threshold of to X's entries in step 1. 
Thus these points fall into x-plane, y-plane or z-plane. If we choose 
a very very small p = 10~ 10 as in the left plot, these points will still 
remain (i.e. hardly any denoising effect). So projecting them onto the 
plane x + y + z = 1 forms the outside triangle. It is awful to get such a 
geometric structure because it is impossible to identify A. However, if 
p goes up to 2 x 10~ 3 as in the right plot, the structure of A emerges. 
Clearly a better geometric structure of data points is achieved by larger 
p. However, we also need to avoid p being too large in that we may 
lose the structure of Conv(X) with few data points. 

2. Considering the presence of noise, we modify the criterion of one point 
belonging to a facet as follows. The point is not required to lie exactly 
on the facet. Instead, we only require that it is near the facet yet not 
around the vertices of the facet. The thresholds e and a in the following 
section of numerical experiments vary from 10 -6 to 0.1 depending on 
the level of noises. If there is almost no noise, we can achieve perfect 
recovery by setting these two thresholds to be any values less than 10~ 5 . 
A higher level of noises demands larger values of e and a. These two 
parameters are usually of the same order. 

3. In step 4, we introduce the threshold 5 to avoid selecting two nearly 
coplanar fitting planes which may actually correspond to the same facet 
of Conv(X). Normally we set 5 to be 0.99. 

4- We will introduce denoising methods by smoothing filters such as box 
filter, Gaussian filter, and total variation (TV denoising) in next sec- 
tion. The combination of these denoising methods and our FCA tend 
to perform better than FCA alone when the noise level is high. We em- 
bed the noise filter into FCA after step 3 where we are able to preserve 
the data points close to the facets of Conv(X) only. Then applying the 
denoising methods yield a more desirable geometric structure of data 
points. 

3 Numerical Experiments 

We report the numerical results of our algorithm in this section. The data we 
have tested include real-world NMR spectra as well as synthetic mixtures. 
All the entries of the mixing matrices and the sources matrices are positive. 
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As we have pointed out in the last section, NN's assumption is a special 
case of ours. So our algorithm is supposed to work for the separations of 
NN data. The first example is to recover three sources from three mixtures, 
where the source signals have stand-alone peaks. For the data, we used true 
NMR spectra of four compounds /3-cyclodextrine,/3-sitosterol, and menthol 
as source signals. The NMR spectrum of a chemical compound is produced 
by the Fourier transformation of a time-domain signal which is a sum of 
sine functions with exponentially decaying envelopes [TO]. The real part of 
the spectrum can be presented as the sum of symmetrical, positive valued, 
Lorentzian-shaped peaks. The NMR reference spectra of /?-cyclodextrine,/3- 
sitosterol, and menthol are shown in the top panel of Fig. [6] from left to 
right. For the parameters, we set p = 10~ 3 ,e = a = 10~ 6 ,<5 = 0.99. Fig. 
[5] shows the geometric structure of the mixtures and mixing matrix. The 
reference spectra and computational results are shown in Fig. [6l A\ is the 
rescaled true mixing matrix, while A\ is the computed mixing matrix via 
our method. Apparently the separation results are very nice in that A\ and 
A\ are identical. 



In a second example, there are three Lorentzian source signals which no 
longer satisfy NN assumption. We use them to create three noisy mixtures 
by adding white Gaussian noise with SNR = 50 dB. A good result is achieved 
by setting p = 50, e = 5 x 10" 3 ,o- = 6 x 10~ 3 ,5 = 0.99. Fig. [7] and Fig. E 
show the geometry of data points and recovery results. True mixing matrix 
A2 and computed A2 are as below (first row of A2 is scaled to be same as 
that of A 2 ) 






0.0769 0.4615 0.3571 
0.3846 0.4615 0.0714 
0.5385 0.0769 0.5714 
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Figure 5: Rescaled columns of mixture matrix and columns of mixing matrix 
from Example 1. 
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Figure 6: Top row: from left to right, the three reference spectra of f3- 
cyclodextrine, /3-sitosterol, and menthol. Bottom row: recovery results by 
our method. 
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Figure 7: Rescaled columns of mixture matrix and columns of mixing matrix 
from Example 2. Some noisy data points are already deleted from X by step 
1 of Algorithm [TJ Parameters are p = 50, e = 5xl0~ 3 , a = 6xl0~ 3 ,<5 = 0.99. 

To provide further insight into how to choose the parameters, two results 
caused by inappropriate threshold values are presented as well in Fig. [9l Fig. 
[10] and Fig. [TTJ In the first experiment, some noisy data points destroy the 
geometric structure as shown in Fig. [9] since p is not big enough to filter out 
them. In the second one, e and a are too small for the level of noises. 

The last example concerns BSS in higher dimensions. We manage to re- 
cover four sources from four mixtures. The data points are in 4-dimensional 
space, so they are difficult to visualize. We chose p = 1, e = 2 x 10~ 5 ,<t = 
10 _5 ,5 = 0.99 here. The original sources and recovered ones are present in 
Fig. [12j The true and computed mixing matrices are shown below (first row 
of A3 is scaled to be same as that of A3). 



A3 



I 0.1923 0.2500 0.2632 0.1000 \ 

0.1923 0.2500 0.2105 0.2000 

0.2692 0.3750 0.4211 0.3000 

\ 0.3462 0.1250 0.1053 0.4000 / 



A^ 



I 0.1000 0.2500 0.2632 0.1923 \ 

0.1997 0.2500 0.2107 0.1922 

0.2992 0.3749 0.4211 0.2694 

\ 0.4011 0.1252 0.1057 0.3456 / 
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Figure 8: Top row: the true source signals. Bottom row: computed source 
signals via our method. Prameters are p = 50, e = 5x 10 -3 , a = 6 x 10~ 3 , 5 = 
0.99. 




Figure 9: Example 2: p = 5, e = 5 x 10~ 3 , a = 6 x 10~ 3 , 6 = 0.99. 
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Figure 12: Top row: the four original source signals. Bottom row: the 
recovered sources. 

To test the performance of our method, we compute the Comon's index 
[8]. The index is defined as follows: let A and A be two nonsingular matrices 
with L2-normalized columns. Then the distance between A and A denoted 
by e(A, A) which reads 



i 


3 


2 

+£ 

j 


El^il" 1 

i 


2 

+£ 

i 


3 


+£ 

3 


Ei^-i 2 - 1 

i 



where D = A _1 A, and dij is the entry of D. In [8] Comon proved that A 
and A are considered nearly equivalent in the sense of BSS (i.e., A = AP A) 
if e(A, A) ~ 0. Fig. [13] and Fig. Q3] show Comon's indices between the 
true mixing matrices and the computed matrices by our method. For the 
result in Fig. [T3l we compute the Comon's indices using the four sources 
in Example 3 and 30 4 x 4 random mixing matrices. Clearly the Comon's 
indices are very small suggesting the equivalence in the sense of BBS of the 
true mixing matrices and the computed ones. Fig. [T4l shows the performance 
of our method in the presence of noise. The three sources in Example 2 are 
combined to generate three noisy mixtures by adding white Gaussian noises 
with SNR varying from 16 dB to 50 dB. The reliability of our method is 
manifested from the plot. 
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Figure 13: Performance of our method on 30 random 4x4 mixing matrices. 
The four sources in Example 3 are used. 




3.1 Denoising 

If there is considerable noise in the data, it would be desirable to reduce 
or remove the noise before feeding them to the proposed method. We shall 
propose to apply imaging denoising techniques to the FCA for noise reduc- 
tion or removal. These denoising methods may be used when the noise level 
is high, and they can be combined with the FCA after step 3. In image 
processing, the simplest denoising method is the sliding mean or box filter 
[18j . Each pixel value is replaced by the mean of its local neighbors. The 
Gaussian filter is similar to the box filter, except that the values of the neigh- 
boring pixels are given different weighting, that being defined by a spatial 
Gaussian distribution. The Gaussian filter is probably the most widely used 
noise reducing filter. 

These local image smooth filters could be applied for the point cloud 
noise reduction with slight modification. They are incorporated in to FCA 
after step 3 where the groups Gj have been obtained. Within each group, 
the X-nearest neighbors are searched based on Euclidean distance, the best 
choice of K depends upon the data and noise level. For example, fewer 
neighbors should be selected when less noise presents. Note that these de- 
noising methods generalize to any dimensional point cloud, they are easy to 
implement. A disadvantage is that they tend to smooth away edge or corner 
structures while reducing the noise. To overcome this shortcoming, we pro- 
pose to apply the total variation idea of image denoising for noise removal. 
It is based on the principle that signals with excessive noise have high total 
variation, that is, the integral of the absolute gradient of the signal is high. 
According to this principle, reducing the total variation of the signal subject 
to it being a close to the original signal, removes the unwanted detail whilst 
preserving important details such as edges. The concept was originated in 
Rudin, Osher, and Fatemi in 1992 [28], and it can be applied to the point 
cloud noise removal. In the following, we shall use the example of a point 
cloud in xyz plane to illustrate the idea of total variation denoising. We first 
preprocess the data by rescaling them onto a plane x + y + z = 1, the pro- 
jected data are two dimensional. For each point (x,y), a distance function 
to the point cloud (data points) is defined as 

d(x, y) = min y/ (x - Xi) 2 + (y - yi) 2 , 

where (xj,yj) T corresponds to the i-th column of X, note that Z{ is not 
included since the Zi = 1 — Xj — yi. Fig. [TdI shows an example of distance 
function to a unit circle, the left plot is the distance function with no noise, 
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Figure 15: The distance function with and without noise. 



while the right plot is the function with noise. For computation, the dis- 
tance function will be restricted on a rectangular region which contains the 
point cloud. Note that d(x, y) > 0, the distance function actually defines 
intensities of an image. Clearly, the set of S = {(x,y) : d(x,y) = 0} is the 
point cloud for the noiseless case. 

We proceed to solve the Rudin-Osher-Fatemi (ROF) model to obtain a 
denoised distance function u(x,y) 

minTV(n) + X/2\\d - u\\l, 

u 

where TV{u) is the total variation of u defined as 

TV(u) = ^2 yVmj - u i,j\ 2 + \ u i,j+i ~ u i,j\ 2 
for an image. A variation for ease of minimization is: 

TV(u) = 22 \ u i+i,j ~ u i,j \ + \ u i,j+l - u i,j\ ■ 

We shall use the recent Chambolle's Algorithm [3] to solve this minimization 
problem. Note that the smaller the parameter A, the stronger the denoising. 
Then the zero level set of the resulting minimizer u(x,y) will be taken as the 
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Figure 16: The point cloud before (left) and after (right) denoising 



denoised point cloud. In the real calculation, we will consider the following 
set with threshold S = {(x, y) : u(x,y) = r} where r takes on a tiny value. 
The noisy point cloud and the result after the noise removal are depicted in 
Fig. [TBI The detected planes from both of them are shown in Fig. [T71 and 
their intersections, i.e., the vertexes of the cone. It can be noted that total 
variation denoising is very effective at preserving edges (thick lines in the 
figures) whilst smoothing away noise in flat regions. The idea of denoising 
distance function by total variation extends to point cloud of any dimension. 

We conduct experiments on performances of FCA with and without TV 
denoising were conducted. A comparison of their performance is showed 
in Fig. fl8l The mixtures are corrupted by additive white Gaussian noises 
varying from 16 dB to 25 dB. In general, TV denoising lowers Comon's 
indices at high noise level resulting in better separation. 



4 Concluding Remarks 

In this paper, we developed a novel BSS method based on facet component 
analysis. We presented a facet based solvability condition for the unique 
solvability of the nonnegative blind source separation problem up to scaling 
and permutation. Our approach exploited both the geometry of data matrix 
and the sparsity of the source signals. Numerical results on NMR signals 
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Figure 17: Computational results of planes from noisy data (left) and de- 
noised data (right). The Green lines are the detected planes using the 
method proposed in the paper. 




SNR(dB) 



Figure 18: Compare Comon's indices of FCA with and without TV denosing 
at high noise levels. 
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validated the solvability condition, and showed satisfactory performance of 
the proposed algorithm. For noisy data, total variation denoising method 
serves as a viable preprocessing step. 

A line of future work is to separate more source signals from their mix- 
tures, known as an undetermined blind source separation, or uBSS. This 
problem presents more challenge than the determined or over-determined 
BSS in that the mixing matrix is non-invertible. Some recent study has 
been done by two of the authors based on a geometric approach to retrieve 
the mixing matrix under suitable solvability conditions |24j . 
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